Databases in SQL Server 2008 span at least two, and
optionally several, database files. There must always be at least one
file for data and one file for the transaction log. These database files
are normal operating system files created in a directory within the
operating system. These files are created when the database is created
or when a database is expanded.
Each database file has the following set of properties:
A logical filename: This name is used for internal reference to the file.
A physical filename: This name is the actual physical pathname of the file.
An initial size: If no size is specified for the primary data file, its initial size, by default, is the minimum size required to hold the contents of the model database.
An optional maximum size: A maximum file size limit can be specified.
A file growth increment: This amount is specified in megabytes or as a percentage.
The properties of each file in a database are stored in that database and are visible through the system catalog view sys.database_files. This view exists in every database and contains one row for each of the database's files. The master database contains a similar view, sys.master_files, that holds file information for all databases within the SQL Server instance. Table 1 lists the most useful columns in the sys.database_files view.
Table 1. Columns in the sys.database_files Catalog View
Column Name | Description |
---|---|
file_id | A file identification number that is unique within each database |
file_guid | GUID for the file |
type | File type (0=rows [that is, data files], 1=log, 2=FILESTREAM, 4=full-text catalogs from versions prior to SQL Server 2008) |
type_desc | Description of the file type (ROWS, LOG, FILESTREAM, FULLTEXT) |
data_space_id | 0 represents a log file; values > 0 represent the ID of the filegroup the data file belongs to |
name | The logical name of the file |
physical_name | The physical name of the file, including path |
state | File state (0 = ONLINE, 1 = RESTORING, 2 = RECOVERING, 3 = RECOVERY_PENDING, 4 = SUSPECT, 6 = OFFLINE, 7 = DEFUNCT) |
state_desc | Description of the file state (ONLINE, RESTORING, RECOVERING, RECOVERY_PENDING, SUSPECT, OFFLINE, DEFUNCT) |
size | Current size of the file in 8KB pages |
max_size | Maximum file size in 8KB pages (0=no growth allowed, -1=file can grow until the disk is full) |
growth | File growth setting (0=fixed, >0=autogrow in units of 8KB pages or by percentage if is_percent_growth is set to 1) |
is_media_read_only | 1=file is on read-only media |
is_read_only | 1= file is marked read-only |
is_sparse | 1=file is a sparse file |
is_percent_growth | 1=growth of file value is percentage |
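As a rough illustration of how the columns in Table 1 fit together, the following query lists each file of the current database with its sizes converted from 8KB pages to megabytes. The column aliases (size_mb, max_size_mb, growth_setting) are invented for this sketch, which assumes it is run in the context of the database you want to inspect.

```sql
-- Sketch: list the files of the current database with sizes in MB.
SELECT file_id,
       type_desc,
       name,
       physical_name,
       state_desc,
       CAST(size AS bigint) * 8 / 1024 AS size_mb,       -- 8KB pages -> MB
       CASE max_size
            WHEN -1 THEN NULL                             -- can grow until the disk is full
            WHEN 0  THEN 0                                -- no growth allowed
            ELSE CAST(max_size AS bigint) * 8 / 1024
       END AS max_size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(12)) + '%'        -- growth holds a percentage
            ELSE CAST(CAST(growth AS bigint) * 8 / 1024 AS varchar(12)) + 'MB'
       END AS growth_setting
FROM sys.database_files
ORDER BY file_id;
```

Running the same SELECT against sys.master_files (with database_id added to the column list) should give the equivalent information for every database on the instance.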
SQL Server uses the file location information visible in the sys.master_files
catalog view most of the time. However, the Database Engine uses the
file location information stored in the primary file to initialize the
file location entries in the master database when attaching a database using the CREATE DATABASE statement with either the FOR ATTACH or FOR ATTACH_REBUILD_LOG options.
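A minimal sketch of the attach case, assuming a hypothetical database named Customer whose primary file sits at the path shown, and assuming the secondary and log files are still at the locations recorded in that primary file:

```sql
-- Only the primary file is named here; the Database Engine reads the
-- locations of the remaining files from it. The path is hypothetical.
CREATE DATABASE Customer
ON ( FILENAME = 'D:\SQL_data\Customer_Data1.mdf' )
FOR ATTACH;
GO
```

If any of the other files have been moved since the database was detached, each moved file would need its own FILENAME entry in the ON clause.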
Every database can have three types of files: a primary data file, secondary data files, and log files. In addition, in SQL Server 2008, databases can also have FILESTREAM data files and full-text data files.
Primary Data File
Every database has exactly one primary database file. The location of the primary database file is stored in the master database (visible via the physical_name column in the sys.master_files view). When SQL Server opens a database, it looks for this file and then reads from it the information about the other files defined for the database.
The file extension for the primary database file defaults to .mdf.
The primary database file always belongs to the primary filegroup, which is also the default filegroup unless you change it. It
is often sufficient to have only one database file for storing your
tables and indexes (the primary database file). The file can, of course,
be created on a RAID partition to help spread I/O. However, if you need
finer control over placement of your tables across disks or disk
arrays, or if you want to be able to back up only a portion of your
database via filegroups, you can create additional, secondary data files
for a database.
Secondary Data Files
A database can have any number of secondary files (in
reality, the maximum number of files per database is 32,767, but that
should be sufficient for most implementations). You can put a secondary
file in the default filegroup or in another filegroup defined for the
database. Secondary data files have the file extension .ndf by default.
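As an illustration, a secondary file might be added to its own filegroup roughly as follows. The database name Customer, the filegroup name Archive_FG, the file name Customer_Archive1, the path, the sizes, and the dbo.OrderArchive table are all hypothetical and chosen only for this sketch.

```sql
-- Create a user-defined filegroup (hypothetical name).
ALTER DATABASE Customer
ADD FILEGROUP Archive_FG;
GO

-- Add a secondary data file (.ndf) to that filegroup.
ALTER DATABASE Customer
ADD FILE ( NAME = 'Customer_Archive1',
           FILENAME = 'E:\SQL_data\Customer_Archive1.ndf',
           SIZE = 100MB,
           FILEGROWTH = 20MB )
TO FILEGROUP Archive_FG;
GO

USE Customer;
GO

-- Place a table on the new filegroup instead of the default one.
CREATE TABLE dbo.OrderArchive
( OrderID   int      NOT NULL PRIMARY KEY,
  OrderDate datetime NOT NULL )
ON Archive_FG;
GO
```

Omitting the TO FILEGROUP clause would add the file to the default filegroup instead.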
Following are some situations in which the use of secondary database files might be beneficial:
You want to perform a partial backup. A
backup can be performed for the entire database or a subset of the
database. The subset is specified as a set of files or filegroups. The
partial backup feature is useful for large databases, where it is
impractical to back up the entire database. When recovering with partial
backups, a transaction log backup must also be available.
You
want more control over placement of database objects. When you create a
table or index, you can specify the filegroup in which the object is
created. This could help you spread I/O by placing your most active
tables or indexes on separate filegroups defined on separate disks or
disk arrays.
Creating multiple files on a
single disk provides no real performance benefit but could help in
recovery. If you have a 90GB database in a single file and have to
restore it, you need to have enough disk space available to create a new
90GB file. If you don’t have 90GB of space available on a single disk,
you cannot restore the database. On the other hand, if the database was
created with three files each 30GB in size, you more likely will be able
to find three 30GB chunks of space available on your server.
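The backup of a subset of the database described in the first item above can be sketched as follows. The database name Customer, the filegroup name Archive_FG, and the backup paths are hypothetical, and the sketch assumes the database uses the full recovery model so that log backups are possible.

```sql
-- Back up a single filegroup rather than the whole database.
BACKUP DATABASE Customer
   FILEGROUP = 'Archive_FG'
   TO DISK = 'G:\SQL_backup\Customer_Archive_FG.bak';
GO

-- Back up the transaction log as well, so that the filegroup can be
-- brought to a state consistent with the rest of the database on restore.
BACKUP LOG Customer
   TO DISK = 'G:\SQL_backup\Customer_Log.trn';
GO
```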
The Log File
Each database must have at least one log file. The log file contains the transaction log records of all changes made in a database. By default, log files have the file extension .ldf.
A database can have several log files, and each log
file can have a maximum size of 2TB. A log file cannot be part of a
filegroup. No information other than transaction log records can be
written to a log file.
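A short sketch of adding a second log file, assuming a hypothetical database named Customer; the logical name, path, and sizes are invented for the example. Note that there is no TO FILEGROUP clause, because log files do not belong to filegroups.

```sql
-- Add a second transaction log file (hypothetical name and path).
ALTER DATABASE Customer
ADD LOG FILE ( NAME = 'Customer_Log2',
               FILENAME = 'G:\SQL_data\Customer_Log2.ldf',
               SIZE = 50MB,
               FILEGROWTH = 20% );
GO

-- Report the log size and the percentage in use for every database.
DBCC SQLPERF(LOGSPACE);
GO
```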
File Management
In SQL Server 2008, you can specify that a database
file should grow automatically as space is needed. SQL Server can also
shrink the size of the database if the space is not needed. You can
control whether to use this feature along with the increment by which
the file is to be expanded. The increment can be specified as a fixed
number of megabytes or as a percentage of the current size of the file.
You can also set a limit on the maximum size of the file or allow it to
grow until no more space is available on the disk.
Listing 1
provides an example of a database being created with a 10MB growth
increment for the first database file, 20MB for the second, and 20%
growth increment for the log file.
Listing 1. Creating a Database with Autogrowth
CREATE DATABASE Customer
ON ( NAME='Customer_Data',                        -- primary data file
     FILENAME='D:\SQL_data\Customer_Data1.mdf',
     SIZE=50MB,
     MAXSIZE=100MB,
     FILEGROWTH=10MB),
   ( NAME='Customer_Data2',                       -- secondary data file, no MAXSIZE
     FILENAME='E:\SQL_data\Customer_Data2.ndf',
     SIZE=100MB,
     FILEGROWTH=20MB)
LOG ON ( NAME='Customer_Log',                     -- transaction log file
     FILENAME='F:\SQL_data\Customer_Log.ldf',
     SIZE=50MB,
     FILEGROWTH=20%)
GO
The Customer_Data file has an initial size of 50MB, a maximum size of 100MB, and a file growth increment of 10MB.
The Customer_Data2 file has an initial size of 100MB, has a file growth increment of 20MB, and can grow until the E: disk partition is full.
The transaction log has an initial size of 50MB; the file increases by 20% with each file growth. The increment is based on the current file size, not the size originally specified: the first growth adds 10MB (20% of 50MB), and the second adds 12MB (20% of 60MB).
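Growth settings can also be changed after the database has been created. The following sketch uses the logical file names from Listing 1; the new values (200MB, 25MB) are arbitrary and chosen only for illustration.

```sql
-- Raise the size limit and growth increment of the primary data file.
ALTER DATABASE Customer
MODIFY FILE ( NAME = 'Customer_Data',
              MAXSIZE = 200MB,
              FILEGROWTH = 25MB );
GO

-- Turn off automatic growth for the second data file.
ALTER DATABASE Customer
MODIFY FILE ( NAME = 'Customer_Data2',
              FILEGROWTH = 0 );
GO

-- Explicitly allow the log file to grow without a configured limit.
ALTER DATABASE Customer
MODIFY FILE ( NAME = 'Customer_Log',
              MAXSIZE = UNLIMITED );
GO
```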
When creating or expanding data files, SQL Server 2008 can use instant file
initialization (also called fast file initialization). With it, the space is
added to the data file immediately, without initializing the pages in the
file with zeros. The existing disk content is not overwritten until new data
is written to the file. This provides a significant performance advantage
when a data file autogrows while an application is attempting to write
data to the database: the application does not need to wait until the
space is initialized. Instant file initialization applies only to data
files, and only when the SQL Server service account has been granted the
Perform Volume Maintenance Tasks right; log files are always zero-initialized.
SQL Server also provides an option to autoshrink
databases as well as manually shrink databases. However, shrinking a
database is a resource-intensive process and should be done only if it
is absolutely imperative to reclaim disk space. Also, if a data file is
constantly shrinking and growing, it can lead to excessive file
fragmentation at the file system level as well as excessive logical
fragmentation within the file, both of which can lead to poor I/O
performance.
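A brief sketch of checking and disabling autoshrink, followed by a one-off manual shrink of a single file. It assumes a hypothetical database named Customer with a data file whose logical name is Customer_Data2; the 100MB target is arbitrary.

```sql
-- Check whether autoshrink is enabled for the database.
SELECT name, is_auto_shrink_on
FROM sys.databases
WHERE name = 'Customer';
GO

-- Turn autoshrink off to avoid repeated shrink/grow cycles.
ALTER DATABASE Customer SET AUTO_SHRINK OFF;
GO

USE Customer;
GO

-- Manually shrink one file; the target size is given in megabytes.
DBCC SHRINKFILE ('Customer_Data2', 100);
GO
```

Because a shrink can fragment indexes, it may be worth checking index fragmentation afterward and rebuilding where necessary.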